If b is a very large number, either positive or negative, the logistic curve becomes so steep that
it looks like what mathematicians call a step function, as shown in Figure 18-3b.
© John Wiley & Sons, Inc.
FIGURE 18-3: The first graph (a) shows that when b is negative, the logistic function slopes downward. The second graph (b)
shows that when b is very large, the logistic function becomes a “step function.”
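The effect of b on the shape of the curve can be checked numerically. The sketch below assumes the logistic model has the form Y = 1/(1 + e^-(a + bX)); the function name logistic and the sample X values are chosen only for illustration and do not come from the figure.

```python
import math

def logistic(x, a, b):
    """Assumed logistic model: Y = 1 / (1 + e^-(a + b*x))."""
    z = a + b * x
    # Use whichever algebraic form keeps the exponent non-positive,
    # so math.exp never overflows for very large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Illustrative X values on both sides of the midpoint (a = 0 puts it at X = 0)
xs = [-2.0, -0.5, -0.1, 0.1, 0.5, 2.0]

# b = 1: gentle upward slope; b = -1: downward slope; b = 100: nearly a step
for b in (1.0, -1.0, 100.0):
    ys = [round(logistic(x, 0.0, b), 3) for x in xs]
    print(f"b = {b:6.1f}: {ys}")
```

With a negative b the printed values fall as X increases, and with a very large b they jump almost directly from 0 to 1 around the midpoint.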
Because the logistic curve approaches the limits 0.0 and 1.0 for extreme values of the
predictor(s), you should not use logistic regression in situations where the fraction of individuals
positive for the outcome does not approach these two limits. Logistic regression is appropriate
for the radiation example because none of the individuals died at a radiation exposure of zero
REMs, and all of the individuals died at doses of 686 REMs and higher. Now imagine a study of
patients with a disease in which the outcome is a cure. If very high doses of the drug did not
cure every patient, and the disease could resolve on its own without any drug, logistic regression
would not be appropriate for the data. This is because some patients at high doses would still have an
outcome value of 0, and some patients at zero dose would have an outcome value of 1.
Logistic regression fits the logistic model to your data by finding the values of a and b that
make the logistic curve come as close as possible to all your plotted points. With this fitted
model, you can then predict the probability of the outcome. See the later section “Predicting
probabilities with the fitted logistic formula” for more details.
GETTING INTO THE NITTY-GRITTY OF LOGISTIC REGRESSION
You don’t need to know all the theoretical and computational details of logistic regression because the software handles them
for you. However, you should have a general idea of what it is doing behind the scenes. The calculations are much more
complicated than those for ordinary straight-line or multivariate least-squares regression. In fact, it’s impossible to write down
a set of formulas that give the logistic regression coefficients in terms of the observed X and Y values. The only way to obtain
them is through a complex iterative procedure that would not be practical to do manually.
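To give a rough idea of what such an iterative procedure can look like, here is a minimal sketch of one common approach, Newton-Raphson maximization of the likelihood. It assumes the model Y = 1/(1 + e^-(a + bX)) with a single predictor. The function name fit_logistic and the dose and died values are hypothetical, made up purely for illustration, and a real statistics package may use a different or more refined algorithm.

```python
import numpy as np

def fit_logistic(x, y, max_iter=25, tol=1e-8):
    """Estimate a and b by Newton-Raphson steps on the log-likelihood."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])  # columns: intercept, predictor
    beta = np.zeros(2)                         # starting guess: a = 0, b = 0
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))  # predicted probabilities
        gradient = X.T @ (y - p)               # slope of the log-likelihood
        w = p * (1.0 - p)                      # per-observation weights
        hessian = X.T @ (X * w[:, None])       # curvature (sign flipped)
        step = np.linalg.solve(hessian, gradient)
        beta = beta + step                     # refine the estimates
        if np.max(np.abs(step)) < tol:         # stop once the change is tiny
            break
    return beta

# Hypothetical data, invented for this sketch (not from the radiation example)
dose = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
died = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

a, b = fit_logistic(dose, died)
print(f"a = {a:.3f}, b = {b:.3f}")
```

Each pass through the loop recomputes the predicted probabilities from the current a and b, then adjusts both values. The loop stops when the adjustment becomes negligible, which is why no single closed-form formula is involved.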